library(tidyverse)
library(plotly)
library(lubridate) # Used for extracting an "hour" and "day" column from the "date_time" column
library(rpart) # Used for creating the decision tree
library(rpart.plot) # Used for creating a visualization for the decision tree
# Assigning the data
metro_raw <- read_csv("~/RStudio_Projects/Metro_Interstate_Traffic_Volume.csv")
DATA 420 - Final Assignment
Class: Spring 2025 Predictive Analytics (DATA-420-DAA)
Instructor: Jason Pemberton
Student: Brandon Carine
Topic: Metro Interstate Traffic in the Twin Cities Area of Minnesota
Date: August 21, 2025
1 Introduction
Driving is by far the most common form of transportation in the United States. As populations grow and technology improves, people want to be better informed about the roads they use to commute. When will I arrive? Will the roads be backed up when I leave the house? What can I expect traffic to be like?
The data set I chose provides information on traffic, time, and weather, recorded along westbound I-94, the main highway connecting St. Paul to its twin city, Minneapolis. This project follows the CRISP-DM method, starting with Business Understanding and finishing with a conclusion.
2 Business Understanding
TwinTraffic is a fictitious company with an app about driving in Minnesota, with a focus on the cities of St. Paul and Minneapolis. The app has different functions, including weather-related updates, information on driver safety, and a real-time traffic map of the two cities. The company wants to provide users with more information for each of these functions.
Figure 1. TwinTraffic logo [1]
The project focuses on three questions:
Question 1: How does weather affect traffic volume, and can this be used to inform users?
Question 2: What weather-related events can be tied together to inform users about dangerous driving conditions throughout the day?
Question 3: Can we predict high traffic volumes to provide notifications to drivers before they start their commute?
These three questions will be answered using Linear and Multiple Regression, K-Means Clustering, and a Decision Tree.
3 Data Understanding
3.1 Loading Libraries and the Data
3.2 Looking at the Data
3.2.1 General Information
str(metro_raw)
Name: Metro_Interstate_Traffic_Volume.csv
Source: https://archive.ics.uci.edu/dataset/492/metro+interstate+traffic+volume [2]
File Size: 3.1 MB
Rows: 48,204
Columns: 9
Time Range: October 2, 2012 9:00 AM to September 30, 2018 11:00 PM
3.2.2 Data Dictionary
| Number | Column Name | Type | Description |
|---|---|---|---|
| 1 | holiday | Character | US National holiday observed |
| 2 | temp | Number (float) | Average temperature recorded in the hour, measured in Kelvin |
| 3 | rain_1h | Number (float) | Total rain recorded in the hour, measured in mm |
| 4 | snow_1h | Number (float) | Total snow recorded in the hour, measured in mm |
| 5 | clouds_all | Number (int) | Percentage of the sky covered by clouds |
| 6 | weather_main | Character | Weather described in one word |
| 7 | weather_description | Character | Weather described with more detail |
| 8 | date_time | Date-time | Date and hour of recorded data instance |
| 9 | traffic_volume | Number (int) | Total number of cars in the hour |
3.3 Statistics
3.3.1 Summary Statistics
summary(metro_raw)
Some extreme outliers were discovered.
rain_1h - The maximum is 9831.3. Given that the unit is mm, I highly doubt that almost 10 metres of rain fell in one hour. This instance is found at row 24,873 (July 11, 2016 at 5 PM), and it will be deleted during data preparation.
temp - The minimum is 0.0. Given that the unit is Kelvin, this would mean Minnesota experienced absolute zero, which is physically impossible. Ten such instances were discovered and will be deleted during data preparation.
snow_1h - The maximum is 0.51. Given that the unit is mm, this is highly suspect: I find it hard to believe Minnesota's heaviest hourly snowfall from 2012 to 2018 was half a millimetre. I will not delete this column, however, because the sensor may have recorded only the snow that accumulated on the road surface, where it melts quickly.
3.3.2 Boxplot of Traffic Volume
# EDA for traffic_volume: Boxplot Visual
metro_raw %>%
ggplot(aes(y = traffic_volume)) +
geom_boxplot(fill = "#52c9e8") +
theme_minimal() +
labs(title = "Boxplot: Traffic Volume", y = "Traffic Volume")
# EDA for traffic_volume: Boxplot Stats
traffic_boxplot <- boxplot.stats(metro_raw$traffic_volume)
Q1 <- traffic_boxplot$stats[2]
Q3 <- traffic_boxplot$stats[4]
Traffic volume has a wide distribution. Half of the data falls between 1,193 (Q1) and 4,933 (Q3) cars per hour, and the median is about 3,200. I will use the Q1 and Q3 values later in data preparation.
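One caveat when reusing these cut points: boxplot.stats() derives its quartiles from Tukey's hinges (via fivenum()), which can differ slightly from quantile()'s default type-7 estimates. A toy illustration:

```r
# Toy data where the hinge and the type-7 quartile disagree
x <- c(1, 2, 4, 6, 8, 10)

fivenum(x)[2]                    # lower hinge, as used by boxplot.stats(): 2
quantile(x, 0.25, names = FALSE) # default type-7 first quartile: 2.5
```

With 48,000+ observations the difference is negligible here, but it is worth knowing which definition the bins are based on.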
4 Data Preparation
Note:
ChatGPT helped me come up with a way to bin the traffic volumes.
ChatGPT also introduced me to the library lubridate, which was helpful with creating the hour and day columns.
4.1 Step 1: Bin Traffic Volume
metro <- metro_raw %>%
mutate(traffic_volume_bin = case_when(
traffic_volume < Q1 ~ "Low",
traffic_volume >= Q1 & traffic_volume < Q3 ~ "Moderate",
traffic_volume >= Q3 ~ "High"
))
The traffic_volume column was binned using the quartiles from Data Understanding. This new column will be used later in the Decision Tree.
4.2 Step 2: Create New Columns
metro$hour <- hour(metro$date_time)
metro$day <- wday(metro$date_time, label = TRUE) # label = TRUE returns the abbreviated day name instead of a number
metro <- metro %>%
mutate(is_weekend = case_when(
day %in% c("Sat", "Sun") ~ 1,
TRUE ~ 0
))
I created an hour column for use in K-Means. Using the day column, I created an is_weekend column for the Decision Tree as well.
4.3 Step 3: Convert Temperature
metro <- metro %>%
mutate(temp_c = temp - 273.15)
Converted Kelvin to Celsius for easier interpretation.
4.4 Step 4: Remove Outliers
metro <- metro[-24873, ] # drop the extreme rain_1h outlier by row index
metro <- metro %>%
filter(temp != 0)
Deleted the outlier pertaining to rain (row 24,873) and the ten instances of absolute-zero temperatures.
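Deleting by row index works here, but it silently breaks if the raw file is ever re-sorted or re-downloaded. A more defensive sketch on toy data (the 300 mm/h cap is a hypothetical sanity threshold, not from the source):

```r
library(dplyr)

# Toy rows mimicking the raw data: one absurd rain value, one absolute-zero temp
toy <- tibble(
  temp    = c(280.1, 0.0, 295.4),
  rain_1h = c(0.3, 0.0, 9831.3)
)

# Drop by value, not by position, so the filter survives re-ordering
toy_clean <- toy %>%
  filter(temp > 0, rain_1h < 300) # 300 mm/h is a hypothetical sanity cap

nrow(toy_clean) # only the first row survives
```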
4.5 Step 5: Select Needed Columns
metro <- metro %>%
select(traffic_volume, traffic_volume_bin, temp_c, rain_1h, snow_1h, clouds_all, hour, is_weekend)
I selected only the columns that will be used in the Modeling portion of the project.
5 Method 1: Multiple Linear Regression
Question: How does weather affect traffic volume, and can this be used to inform users?
5.1 Modeling
I will first fit a simple linear regression of traffic_volume on each numeric weather variable individually.
5.1.1 Linear Regression
# Create First Linear Regression
temp_lm <- lm(traffic_volume ~ temp_c, data = metro)
# Assigning R-Squared, intercept, and slope of the regression
temp_r2 <- summary(temp_lm)$r.squared
temp_int <- coef(temp_lm)[1]
temp_slope <- coef(temp_lm)[2]
# Adding the linear equation text
temp_text <- paste0(
"Traffic Volume = ", round(temp_int, 2),
" + ", round(temp_slope, 2), " * Temperature(C)\n",
"R^2 = ", round(temp_r2, 3)
)
#Scatterplot of First Linear Regression
metro %>%
ggplot(aes(temp_c, traffic_volume))+
geom_point(colour = "#002d5d")+
geom_smooth(method = lm, se = FALSE, colour = "#52c9e8")+
annotate("text", x = 5, y = 8000, label = temp_text, hjust = 0, size = 3)+
labs(
title = "Temperature(C) vs Traffic Volume",
x = "Temperature(C)",
y = "Traffic Volume"
) +
theme_minimal()
Positive correlation: each one-degree (Celsius) increase in temperature corresponds to about 21 more cars on the road. The model is weak, with an R-squared of 0.017.
# Create Second Linear Regression
rain_lm <- lm(traffic_volume ~ rain_1h, data = metro)
# Assigning R-Squared, intercept, and slope of the regression
rain_r2 <- summary(rain_lm)$r.squared
rain_int <- coef(rain_lm)[1]
rain_slope <- coef(rain_lm)[2]
# Adding the linear equation text
rain_text <- paste0(
"Traffic Volume = ", round(rain_int, 2),
" + ", round(rain_slope, 2), " * Rain(mm)\n",
"R^2 = ", round(rain_r2, 3)
)
#Scatterplot of Second Linear Regression
metro %>%
ggplot(aes(rain_1h, traffic_volume))+
geom_point(colour = "#002d5d")+
geom_smooth(method = lm, se = FALSE, colour = "#52c9e8")+
annotate("text", x = 30, y = 7000, label = rain_text, hjust = 0, size = 3)+
labs(
title = "Rain(mm) vs Traffic Volume",
x = "Rain(mm)",
y = "Traffic Volume"
) +
theme_minimal()
Negative correlation: each additional mm of rain corresponds to about 44 fewer cars on the road. The model is weak, with an R-squared of 0.001.
# Create Third Linear Regression
snow_lm <- lm(traffic_volume ~ snow_1h, data = metro)
# Assigning R-Squared, intercept, and slope of the regression
snow_r2 <- summary(snow_lm)$r.squared
snow_int <- coef(snow_lm)[1]
snow_slope <- coef(snow_lm)[2]
# Adding the linear equation text
snow_text <- paste0(
"Traffic Volume = ", round(snow_int, 2),
" + ", round(snow_slope, 2), " * Snow(mm)\n",
"R^2 = ", round(snow_r2, 3)
)
#Scatterplot of Third Linear Regression
metro %>%
ggplot(aes(snow_1h, traffic_volume))+
geom_point(colour = "#002d5d")+
geom_smooth(method = lm, se = FALSE, colour = "#52c9e8")+
annotate("text", x = 0.25, y = 7000, label = snow_text, hjust = 0, size = 3)+
labs(
title = "Snow(mm) vs Traffic Volume",
x = "Snow(mm)",
y = "Traffic Volume"
) +
theme_minimal()
Positive correlation: each additional mm of snow corresponds to about 177 more cars on the road. The model is extremely weak, with an R-squared of approximately 0.
I have two possible explanations for this:
- People drive slower when it’s snowing, and therefore there are more cars on the road at any one time; traffic is more gridlocked.
- This model is extremely weak and therefore may not be meaningful at all.
# Create Fourth Linear Regression
cloud_lm <- lm(traffic_volume ~ clouds_all, data = metro)
# Assigning R-Squared, intercept, and slope of the regression
cloud_r2 <- summary(cloud_lm)$r.squared
cloud_int <- coef(cloud_lm)[1]
cloud_slope <- coef(cloud_lm)[2]
# Adding the linear equation text
cloud_text <- paste0(
"Traffic Volume = ", round(cloud_int, 2),
" + ", round(cloud_slope, 2), " * Cloud Coverage(%)\n",
"R^2 = ", round(cloud_r2, 3)
)
#Scatterplot of Fourth Linear Regression
metro %>%
ggplot(aes(clouds_all, traffic_volume))+
geom_point(colour = "#002d5d")+
geom_smooth(method = lm, se = FALSE, colour = "#52c9e8")+
annotate("text", x = 50, y = 8000, label = cloud_text, hjust = 0, size = 3)+
labs(
title = "Cloud Coverage(%) vs Traffic Volume",
x = "Cloud Coverage(%)",
y = "Traffic Volume"
) +
theme_minimal()
Positive correlation: each additional percent of cloud coverage corresponds to about 3 more cars on the road. The model is weak, with an R-squared of 0.004.
5.1.2 Multiple Linear Regression
After seeing very weak R-squared values for each individual independent variable, I decided to perform multiple regression.
weather_model <- lm(traffic_volume ~ temp_c + rain_1h + snow_1h + clouds_all, data = metro)
best_weather_model <- step(weather_model, direction = "both")
summary(best_weather_model)
Call:
lm(formula = traffic_volume ~ temp_c + rain_1h + clouds_all,
data = metro)
Residuals:
Min 1Q Median 3Q Max
-3793.1 -1931.3 113.9 1641.0 4827.4
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2868.0467 16.0337 178.88 <2e-16 ***
temp_c 22.8038 0.7109 32.08 <2e-16 ***
rain_1h -84.2778 8.9756 -9.39 <2e-16 ***
clouds_all 4.4172 0.2314 19.09 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1961 on 48189 degrees of freedom
Multiple R-squared: 0.02604, Adjusted R-squared: 0.02598
F-statistic: 429.5 on 3 and 48189 DF, p-value: < 2.2e-16
5.1.3 Modeling Results
The best model uses temperature, rainfall, and cloud coverage to predict traffic volume; stepwise selection dropped snowfall because it did not improve the model.
The model is Traffic Volume = 2868.05 + (22.80 * temp_c) + (-84.28 * rain_1h) + (4.42 * clouds_all).
Holding the other variables constant, each one-degree (Celsius) increase in temperature adds about 23 cars, each additional mm of rain removes about 84 cars, and each additional percent of cloud coverage adds about 4 cars.
This model is still weak, with an adjusted R-squared of 0.026.
All three independent variables, along with the intercept, are statistically significant, with p-values well below the 0.05 threshold.
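To make the fitted equation concrete, here is a hand-computed prediction from the reported coefficients (the weather values are hypothetical, chosen only for illustration):

```r
# Coefficients as reported by summary(best_weather_model)
b0 <- 2868.05; b_temp <- 22.80; b_rain <- -84.28; b_cloud <- 4.42

# A hypothetical hour: 10 C, 2 mm of rain, 75% cloud cover
pred <- b0 + b_temp * 10 + b_rain * 2 + b_cloud * 75
round(pred) # 3259 cars expected in that hour
```

Note how close this lands to the intercept alone: even fairly different weather moves the prediction by only a few hundred cars, which previews the weak fit discussed next.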
5.2 Evaluation
The final model chosen is the Traffic Volume model from the Results section of Multiple Linear Regression.
This model is weak: the best multiple-regression fit says the given weather variables explain only about 2.6% of the variation in traffic volume. This suggests that traffic on the main highway connecting Minneapolis and St. Paul stays steady regardless of weather conditions, likely because many people still need to commute to work.
5.3 Deployment
Therefore, for the app, I would inform users that weather does not significantly predict traffic volume, while noting that this does not mean driving conditions cannot be poor. The latter is explored in the next method.
Figure 2. TwinTraffic Weather and Traffic Information [3]
Note: My suggestion for TwinTraffic is to update how they collect their snowfall data. Snowfall accumulation on a very busy road will more often than not produce low values, as the road is almost always heated. Therefore, my recommendation would be for them to record total snowfall for the hour with no melting. I believe traffic is more impacted based on what people see on their weather forecasts and in the sky rather than what is actually on the road.
6 Method 2: K-Means Clustering
Question:
What weather-related events can be tied together to inform users about dangerous driving conditions throughout the day?
6.1 Modeling
I will perform k-means clustering using hour, along with combinations of traffic volume, rain, and temperature.
Note: I understand k-means computes distances on numeric values, while hour is really a cyclical time variable. Most hours are fine, since adjacent hours are one unit apart, but at the boundary between 23 and 0 k-means treats them as far apart when they are really only one hour apart. For this analysis that is acceptable, because my goal is to find groupings based on general times of day, and I believe the model will still find useful groupings.
Also, the snow_1h variable was not considered in this method based on the points made in Data Understanding.
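For reference, if the 23-to-0 wraparound ever did matter, a common workaround is to encode the hour cyclically with sine and cosine so that 23:00 and 00:00 end up adjacent in feature space. A minimal sketch on toy data (the hour_sin/hour_cos column names are my own):

```r
library(dplyr)

toy <- tibble(hour = c(0, 6, 12, 23))

# Map each hour onto a point on the unit circle
toy_cyc <- toy %>%
  mutate(
    hour_sin = sin(2 * pi * hour / 24),
    hour_cos = cos(2 * pi * hour / 24)
  )

# On the raw scale, hours 23 and 0 are 23 units apart; on the circle they are close
d <- sqrt((toy_cyc$hour_sin[4] - toy_cyc$hour_sin[1])^2 +
          (toy_cyc$hour_cos[4] - toy_cyc$hour_cos[1])^2)
d # about 0.26
```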
# Grab the data used for K-Means
metro_kmean_1 <- metro %>%
select(hour, traffic_volume, rain_1h)
# Scale the data
metro_scaled_1 <- metro_kmean_1 %>%
mutate(across(everything(), ~ as.numeric(scale(.x)))) # as.numeric() unwraps the one-column matrix scale() returns
# Determine best K
set.seed(123)
wss_1 <- map_dbl(1:10, function(k) {
kmeans(metro_scaled_1, centers = k, nstart = 25)$tot.withinss
})
tibble(k = 1:10, wss_1 = wss_1) %>%
ggplot(aes(x = k, y = wss_1)) +
geom_line() +
geom_point() +
scale_x_continuous(breaks = 1:10) + # <- forces integer ticks
labs(title = "Elbow Method for Optimal K",
x = "Number of Clusters", y = "WCSS") +
theme_minimal()
I selected a k-value of 4 based on the elbow plot: the curve bends most sharply there, flattening beyond k = 4.
# Create the model
set.seed(123)
kmeans_result_1 <- kmeans(metro_scaled_1, centers = 4, nstart = 25)
# Assign the clusters
metro_clustered_1 <- metro_kmean_1 %>%
mutate(cluster = factor(kmeans_result_1$cluster))
# Create 3D scatterplot using plotly
plot_ly(
data = metro_clustered_1,
x = ~hour,
y = ~traffic_volume,
z = ~rain_1h,
color = ~cluster,
colors = c("#00204c", "#ff6b6b", "#006d4d", "#91e2ee"),
type = "scatter3d",
mode = "markers",
marker = list(size = 4)
) %>%
layout(
title = list(text = "Clustering for Hour, Traffic, and Rain"),
margin = list(t = 80),
scene = list(
xaxis = list(title = "Hour"),
yaxis = list(title = "Traffic Volume"),
zaxis = list(title = "Rain(mm)")
)
)
Red - Afternoon to night, low to moderate traffic, low rain. (Safe after-work drive)
Green - Early morning, low traffic, low rain. (Very safe, early-risers drive)
Dark Blue - Varying time, varying traffic, high rain. (Wet roads all day)
Light Blue - Morning to evening, moderate to high traffic, low rain. (Normal, busy commute)
# Grab the data used for K-Means
metro_kmean_2 <- metro %>%
select(hour, temp_c, traffic_volume)
# Scale the data
metro_scaled_2 <- metro_kmean_2 %>%
mutate(across(everything(), ~ as.numeric(scale(.x)))) # as.numeric() unwraps the one-column matrix scale() returns
# Determine best K
set.seed(123)
wss_2 <- map_dbl(1:10, function(k) {
kmeans(metro_scaled_2, centers = k, nstart = 25)$tot.withinss
})
tibble(k = 1:10, wss_2 = wss_2) %>%
ggplot(aes(x = k, y = wss_2)) +
geom_line() +
geom_point() +
scale_x_continuous(breaks = 1:10) + # <- forces integer ticks
labs(title = "Elbow Method for Optimal K",
x = "Number of Clusters", y = "WCSS") +
theme_minimal()
I selected a k-value of 3 based on the elbow plot: the curve bends most sharply there, flattening beyond k = 3.
# Create the model
set.seed(123)
kmeans_result_2 <- kmeans(metro_scaled_2, centers = 3, nstart = 25)
# Assign the clusters
metro_clustered_2 <- metro_kmean_2 %>%
mutate(cluster = factor(kmeans_result_2$cluster))
# Create 3D scatterplot using plotly
plot_ly(
data = metro_clustered_2,
x = ~hour,
y = ~temp_c,
z = ~traffic_volume,
color = ~cluster,
colors = c("#00204c", "#ff6b6b", "#006d4d"),
type = "scatter3d",
mode = "markers",
marker = list(size = 4)
) %>%
layout(
title = list(text = "Clustering for Hour, Temperature(C), and Traffic Volume"),
margin = list(t = 80),
scene = list(
xaxis = list(title = "Hour"),
yaxis = list(title = "Temperature(C)"),
zaxis = list(title = "Traffic Volume")
)
)
Red - Morning to night, cold, low to high traffic. (Normal winter driving)
Green - Early morning, varying temperature, low traffic. (Very safe, early-risers drive)
Dark Blue - Morning to night, warm, low to high traffic. (Normal summer driving)
# Grab the data used for K-Means
metro_kmean_3 <- metro %>%
select(hour, temp_c, rain_1h)
# Scale the data
metro_scaled_3 <- metro_kmean_3 %>%
mutate(across(everything(), ~ as.numeric(scale(.x)))) # as.numeric() unwraps the one-column matrix scale() returns
# Determine best K
set.seed(123)
wss_3 <- map_dbl(1:10, function(k) {
kmeans(metro_scaled_3, centers = k, nstart = 25)$tot.withinss
})
tibble(k = 1:10, wss_3 = wss_3) %>%
ggplot(aes(x = k, y = wss_3)) +
geom_line() +
geom_point() +
scale_x_continuous(breaks = 1:10) + # <- forces integer ticks
labs(title = "Elbow Method for Optimal K",
x = "Number of Clusters", y = "WCSS") +
theme_minimal()
I selected a k-value of 4 based on the elbow plot: the curve bends most sharply there, flattening beyond k = 4.
# Create the model
set.seed(123)
kmeans_result_3 <- kmeans(metro_scaled_3, centers = 4, nstart = 25)
# Assign the clusters
metro_clustered_3 <- metro_kmean_3 %>%
mutate(cluster = factor(kmeans_result_3$cluster))
# Visualization
# Create 3D scatterplot using plotly
plot_ly(
data = metro_clustered_3,
x = ~hour,
y = ~temp_c,
z = ~rain_1h,
color = ~cluster,
colors = c("#00204c", "#ff6b6b", "#006d4d", "#91e2ee"),
type = "scatter3d",
mode = "markers",
marker = list(size = 4)
) %>%
layout(
title = list(text = "Clustering for Hour, Temperature(C), and Rain"),
margin = list(t = 80),
scene = list(
xaxis = list(title = "Hour"),
yaxis = list(title = "Temperature(C)"),
zaxis = list(title = "Rain(mm)")
)
)
Red - Morning, all temperatures, low rain. (Safe, morning commute)
Green - All day, warm, high rain. (Wet, slippery summer days)
Dark Blue - Afternoon to night, warm, low rain. (Normal after-work, summer drive)
Light Blue - All day, cold, low rain. (Cold, winter driving)
6.2 Evaluation
The final model I chose is the Hour, Temperature, and Rain model, because the goal of this method was to identify potentially dangerous driving conditions, and this model gives the clearest picture of physical road conditions. I would have liked to include the snowfall data, since snow is a major cause of dangerous driving conditions, but I judged it unreliable for the reasons discussed in Data Understanding.
6.3 Deployment
Using this Hour, Temperature, and Rain model I can make some suggestions. The following groupings were made:
Red Cluster - Morning drive will be safe, no rain expected.
Green Cluster - Summer rain is expected today, roads may be slippery, drive with caution.
Dark Blue Cluster - Warm weather and clear skies for afternoon/evening drive, safe driving conditions.
Light Blue Cluster - Cold temperatures all day, even without rain, the roads could be slick in the morning or night, drive with caution.
Figure 3. TwinTraffic Notification: Red Cluster [3], [4]
Figure 4. TwinTraffic Notification: Green Cluster [3], [4]
Figure 5. TwinTraffic Notification: Dark Blue Cluster [3], [4]
Figure 6. TwinTraffic Notification: Light Blue Cluster [3], [4]
7 Method 3: Decision Tree
Question:
Can we predict high traffic volumes to provide notifications to drivers before they start their commute?
7.1 Modeling
I will create a classification decision tree using hour and is_weekend. The target variable is the binned traffic_volume. The categories are Low, Moderate, and High.
7.1.1 Predictions
# Choose what to predict
table(metro$traffic_volume_bin)
High Low Moderate
12060 12043 24090
# Create data partitions
set.seed(123)
train_index <- sample(1:nrow(metro), 0.7 * nrow(metro)) # randomly sample 70% of the row indices
train_data <- metro[train_index, ] # only grabs the rows with those indexes, and all columns
test_data <- metro[-train_index, ] # grabs everything except those indexed rows
# Fit a decision tree (Classification)
tree_model <- rpart(traffic_volume_bin ~ hour + is_weekend, data = train_data, method = "class")
predictions <- predict(tree_model, test_data, type = "class") # predict a class label (Low/Moderate/High) for each test row
# Cross-tabulate predictions against actual bins to see how the model performs on the test data
confusionMatrix <- table(Predicted = predictions, Actual = test_data$traffic_volume_bin)
# Print the table
confusionMatrix
          Actual
Predicted High Low Moderate
High 2873 23 526
Low 0 3179 172
Moderate 723 375 6587
# Assigns the accuracy of the model to a name
accuracy <- sum(predictions == test_data$traffic_volume_bin) / nrow(test_data) * 100
# Print the accuracy
print(paste("Accuracy:", round(accuracy, 3), "%"))
[1] "Accuracy: 87.419 %"
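Overall accuracy can hide class-level weaknesses, so it is worth deriving per-class precision and recall from the confusion matrix as well (a sketch using the counts reported above):

```r
# Confusion-matrix counts copied from the output above (rows = Predicted)
cm <- matrix(c(2873,   23,  526,
                  0, 3179,  172,
                723,  375, 6587),
             nrow = 3, byrow = TRUE,
             dimnames = list(Predicted = c("High", "Low", "Moderate"),
                             Actual    = c("High", "Low", "Moderate")))

precision <- diag(cm) / rowSums(cm) # of hours predicted as class X, the share that truly were X
recall    <- diag(cm) / colSums(cm) # of hours that truly were class X, the share the model caught

round(precision, 3) # High 0.840, Low 0.949, Moderate 0.857
round(recall, 3)    # High 0.799, Low 0.889, Moderate 0.904
```

Recall for the High class (about 0.80) is the figure that matters most for alerting: roughly one in five genuinely high-traffic hours would go unflagged.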
7.1.2 Plot
# Plots the model, rpart.plot(model, main = "Title you want")
rpart.plot(tree_model, type = 5, main = "Decision Tree for Traffic Volume")
7.2 Evaluation
This is a highly accurate model. After training on 70% of the data and testing on the remaining 30%, the decision tree achieved an accuracy of 87.4%. This model will therefore be used in the TwinTraffic app.
There are two leaves that result in High traffic volume.
- Weekdays, from 6 AM to before 10 AM (third leaf from the left).
- Weekdays, from 2 PM to before 6 PM (second leaf from the left).
7.3 Deployment
These leaves match the morning commute and the after-work rush hour exactly. The app will trigger high-traffic alerts during these peak times.
Figure 7. TwinTraffic Alert [3], [4]
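The two High leaves translate directly into an alert rule the app could run; a minimal sketch (the hour thresholds mirror the tree's splits, but the helper function itself is hypothetical, not part of the app):

```r
# Hypothetical helper: should the app push a high-traffic alert?
# High leaves: weekdays 6 AM to before 10 AM, and 2 PM to before 6 PM
high_traffic_alert <- function(hour, is_weekend) {
  !is_weekend && ((hour >= 6 && hour < 10) || (hour >= 14 && hour < 18))
}

high_traffic_alert(hour = 8,  is_weekend = FALSE) # TRUE: weekday morning rush
high_traffic_alert(hour = 8,  is_weekend = TRUE)  # FALSE: weekend morning
high_traffic_alert(hour = 16, is_weekend = FALSE) # TRUE: after-work rush
```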
8 Conclusion
This project examined how weather- and time-related variables influenced traffic volume and driving conditions in the Twin Cities area from 2012 to 2018.
From the data we found that:
Weather did not explain much of the variation in traffic volume.
There were clear clusters of driving conditions, useful for informing app users.
High-traffic periods could be predicted from the hour and whether it was a weekend, enabling alerts before users start their commute.
With these results, TwinTraffic can make data-driven decisions and bring better information to its users.
9 References
| Ref. Num. | Name | Link | Notes |
|---|---|---|---|
| [1] | TwinTraffic logo | https://chatgpt.com/ | AI-generated by ChatGPT (OpenAI), created on 2025-08-18. |
| [2] | Metro_Interstate_Traffic_Volume.csv | https://archive.ics.uci.edu/dataset/492/metro+interstate+traffic+volume | This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. |
| [3] | Mobile phone mockup | https://www.canva.com/templates/EAFHKP1CWnU-cream-minimalist-notification-reminder-message-instagram-story/ | https://www.canva.com/policies/content-license-agreement/ |
| [4] | Wallpaper of Minneapolis | https://commons.wikimedia.org/wiki/File:Minneapolis_%2849674291772%29.jpg | This file is licensed under the Creative Commons Attribution-Share Alike 2.0 Generic license. |
Note:
ChatGPT helped a lot with the final formatting of this report, guiding me with how to create tabs and organizing pictures.